Agglomerative Mean-Shift Clustering via Query Set Compression
Authors
Abstract
Mean-Shift (MS) is a powerful non-parametric clustering method. Although it can achieve good accuracy, its computational cost is high even on moderately sized data sets. In this paper, to speed up the algorithm, we develop an agglomerative MS clustering method called Agglo-MS and analyze its mode-seeking ability and convergence properties. Our method is built on an iterative query set compression mechanism motivated by the quadratic bounding optimization nature of MS. The whole framework can be implemented efficiently with linear running time complexity. Furthermore, we show that pairwise constraint information can be naturally integrated into our framework to derive a semi-supervised non-parametric clustering method. Extensive experiments on toy and real-world data sets validate the speedup advantage and numerical accuracy of our method, as well as the superiority of its semi-supervised version.
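For context, the cost problem the abstract refers to can be seen in a minimal sketch of standard Gaussian mean-shift. This is not Agglo-MS and does not implement query set compression; it is the plain baseline in which every data point is its own query and each iteration visits the whole data set. The function names, the `bandwidth` parameter, and the `merge_radius` rule for grouping converged points are assumptions made for illustration only.

```python
import math


def mean_shift_point(x, data, bandwidth, tol=1e-6, max_iter=500):
    """Move one query point x uphill on a Gaussian kernel density estimate."""
    for _ in range(max_iter):
        # Gaussian kernel weight of every data point relative to x.
        weights = [
            math.exp(-sum((xi - pi) ** 2 for xi, pi in zip(x, p))
                     / (2 * bandwidth ** 2))
            for p in data
        ]
        total = sum(weights)
        # The mean-shift update: kernel-weighted mean of the data.
        new_x = tuple(
            sum(w * p[d] for w, p in zip(weights, data)) / total
            for d in range(len(x))
        )
        shift = math.dist(new_x, x)
        x = new_x
        if shift < tol:
            break
    return x


def mean_shift_cluster(data, bandwidth, merge_radius=None):
    """Run mean-shift from every data point and group points by mode.

    merge_radius is an illustrative assumption: converged points closer
    than this distance are treated as the same mode.
    """
    if merge_radius is None:
        merge_radius = bandwidth / 2
    modes, labels = [], []
    for p in data:
        m = mean_shift_point(tuple(p), data, bandwidth)
        for k, existing in enumerate(modes):
            if math.dist(m, existing) < merge_radius:
                labels.append(k)
                break
        else:
            # No existing mode is close enough, so record a new one.
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels
```

Because each of the n queries sums over all n data points at every iteration, one sweep costs on the order of n squared kernel evaluations. Reducing the number of queries is the kind of saving that a query set compression scheme targets.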
Similar resources
Comparison of Agglomerative and Partitional Document Clustering Algorithms
Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters, and in greatly improving the retrieval performance either via cluster-driven dimensionality reduction, term-weighting, or query expansion. This ever-increasing importance of do...
Agglomerative Mean Shift Cluster Using Shortest Path and Fuzzification Algorithm
This research paper presents an agglomerative mean shift with fuzzy clustering algorithm for numerical and image data. It extends the standard fuzzy C-Means algorithm by introducing a penalty term into the objective function, which makes the clustering process insensitive to the initial cluster centers. The new Shortest Path and Fuzzification algorithm can produce more consistent c...
Implementation of Hybrid Clustering Algorithm with Enhanced K-Means and Hierarchal Clustering
We propose a hybrid clustering method that combines the strengths of both partitioning and agglomerative clustering methods. Clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for their interactive visualization and exploration as they provide data-views that are consistent, predictable, and at different levels of granularit...
Anisotropic Agglomerative Adaptive Mean-Shift
Mean Shift is widely used today for mode detection and clustering. In practice, though, the technique is challenged by its assumptions of isotropicity and homoscedasticity. We present an adaptive Mean Shift methodology that allows for full anisotropic clustering, through unsupervised local bandwidth selection. The bandwidth matrices evolve naturally, adapting locally through agglomeration, and ...
Improving Search Engine For A Digital Library
This project introduces a novel approach for using click through data to discover clusters of similar queries and similar URLs. One can simply observe that users reformulate their queries to find a desirable result. We define this sequence of queries as a query chain. Our data set consists of records containing user id, query term, query date and time, clicked item URL and clicked item rank if ...